
FORM 1 (cDNA sequence provided) : 

1 ATGGACAGAG TTTATGAAAT TCCTGAGGAG CCAAATGTGG ATCCGGTTTC 

51 ATCTCTGGAG GAAGATGTCA TCCGTGGAGC CAACCCCCGA TTTACTTTTC 

101 CATTTAGCAT CCTTTTCTCC ACCTTTTTGT ACTGTGGGGA GGCTGCATCT 

151 GCTTTGTACA TGGTTAGAAT CTATCGAAAG AATAGTGAAA CTTACCGGAT 

201 GACATACACC TTTTCTTTCT TTATGTTTTC ATCCATTATG GTCCAGTTGA 

251 CCCTCATTTT TGTCCACAGA GATCTAGCCA AAGATAAACC GCTATCATTA 

301 TTTATGCATC TAATCCTCTT GGGACCTGTT ATCAGATGTT TGGAGGCCAT 

351 GATTAAGTAC CTCACACTGT GGAAGAAAGA GGAGCAGGAG GAGCCCTATG 

401 TCAGCCTCAC CCGAAAGAAG ATGCTAATAG ATGGCGAGGA GGTGCTGATA 

451 GAATGGGAGG TGGGCCACTC CATCCGGACC CTGGCTATGC ACCGCAATGC 

501 CTACAAACGT ATGTCACAGA TCCAAGCCTT CCTGGGCTCA GTGCCCCAGC 

551 TGACCTATCA GCTCTATGTG AGCCTGATCT CTGCAGAGGT TCCCCTGGGT 

601 AGAGTTGTGC TAATGGTATT TTCCCTGGTA TCTGTCACCT ATGGGGCCAC 

651 CCTTTGCAAT ATGTTGGCTA TCCAGATCAA GTACGATGAC TACAAGATTC 

701 GCCTTGGGCC ACTAGAAGTC CTCTGCATCA CCATCTGGCG GACATTGGAG 

751 ATCACTTCCC GCCTCCTGAT TCTGGTGCTC TTCTCAGCCA CTTTGAAATT 

801 GAAGGCTGTG CCCTTCCTAG TGCTCAACTT CCTGATCATC CTCTTTGAGC 

851 CCTGGATTAA GTTCTGGAGA AGTGGTGCCC AGATGCCCAA TAACATTGAG 

901 AAAAACTTCA GCCGGGTCGG CACTCTGGTG GTCCTGATTT CAGTCACCAT 

951 CCTCTATGCT GGCATCAACT TCTCTTGCTG GTCAGCTTTG CAGTTGAGGT 

1001 TGGCAGACAG AGATCTCGTC GACAAAGGGC AGAACTGGGG ACATATGGGC 

1051 CTGCACTATA GTGTGAGGTT GGTAGAGAAT GTGATCATGG TCTTGGTTTT 

1101 TAAGTTCTTT GGAGTGAAAG TGTTACTGAA TTACTGTCAT TCCTTGATTG 

1151 CCTTGCAGCT CATTATTGCT TATCTGATTT CCATTGACTT CATGCTCCTT 

1201 TTCTTCCAGT ACTTGCATCC ATTGCGCTCA CTCTTCACCC ATAATGTAGT 

1251 AGACTACCTC CATTGTGTCT GCTGTCACCA GCACCCTCGG ACCAGGGTTG 

1301 AGAACTCAGA GCCACCCTTT GAGACTGAAG CAAGGCAAAG TGTTGTCTGA 



FEATURES : 

Start Codon: 1 

Stop Codon: 1348 

3'UTR: 1351 
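
The start and stop positions listed above can be checked by translating the cDNA in frame. The following is a minimal sketch, assuming the standard genetic code; the helper name `find_orf` is hypothetical and is not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def find_orf(cdna, start=1):
    """Translate from the 1-based position `start` until the first stop codon.

    Returns (protein, stop_position), where stop_position is the 1-based
    position of the first base of the stop codon, or None if no in-frame
    stop codon is reached. Whitespace in the input (as in the 10-base
    blocks of the listing) is ignored.
    """
    seq = "".join(cdna.split()).upper()
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" for ambiguous codons
        if aa == "*":
            return "".join(protein), i + 1
        protein.append(aa)
    return "".join(protein), None

# First 50 bases of FORM 1: matches the start of the FORM 1 protein.
print(find_orf("ATGGACAGAG TTTATGAAAT TCCTGAGGAG CCAAATGTGG ATCCGGTTTC"))
```

Applied to the full FORM 1 cDNA, a stop position of 1348 would correspond to 1347 coding bases, that is, 449 codons, which agrees with the stated length of the FORM 1 protein.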



FORM 2 (transcript sequence provided) : 

1 ATGAACACAA GACCACAACA TTCAGAAAGA ACCTCGACAA TGGACAGAGT 

51 TTATGAAATT CCTGAGGAGC CAAATGTGGA TCCGGTTTCA TCTCTGGAGG 

101 AAGATGTCAT CCGTGGAGCC AACCCCCGAT TTACTTTTCC ATTTAGCATC 

151 CTTTTCTCCA CCTTTTTGTA CTGTGGGGAG GCTGCATCTG CTTTGTACAT 

201 GGTTAGAATC TATCGAAAGA ATAGTGAAAC TTACTGGATG ACATACACCT 

251 TTTCTTTCTT TATGTTTTCA TCCATTATGG TCCAGTTGAC CCTCATTTTT 

301 GTCCACAGAG ATCTAGCCAA AGATAAACCG CTATCATTAT TTATGCATCT 

351 AATCCTCTTG GGACCTGTTA TCAGATGTTT GGAGGCCATG ATTAAGTACC 

401 TCACACTGTG GAAGAAAGAG GAGCAGGAGG AGCCCTATGT CAGCCTCACC 

451 CGAAAGAAGA TGCTAATAGA TGGCGAGGAG GTGCTGATAG AATGGGAGGT 

501 GGGCCACTCC ATCCGGACCC TGGCTATGCA CCGCAATGCC TACAAACGTA 

551 TGTCACAGAT CCAAGCCTTC CTGGGCTCAG TGCCCCAGCT GACCTATCAG 

601 CTCTATGTGA GCCTGATCTC TGCAGAGGTT CCCCTGGGTA GAGTTGTGCT 

651 AATGGTATTT TCCCTGGTAT CTGTCACCTA TGGGGCCACC CTTTGCAATA 

701 TGTTGGCTAT CCAGATCAAG TACGATGACT ACAAGATTCG CCTTGGGCCA 

751 CTAGAAGTCC TCTGCATCAC CATCTGGCGG ACATTGGAGA TCACTTCCCG 

801 CCTCCTGATT CTGGTGCTCT TCTCAGCCAC TTTGAAATTG AAGGCTGTGC 

851 CCTTCCTAGT GCTCAACTTC CTGATCATCC TCTTTGAGCC CTGGATTAAG 

901 TTCTGGAGAA GTGGTGCCCA GATGCCCAAT AACATTGAGA AAAACTTCAG 

951 CCGGGTCGGC ACTCTGGTGG TCCTGATTTC AGTCACCATC CTCTATGCTG 

1001 GCATCAACTT CTCTTGCTGG TCAGCTTTGC AGTTGAGGTT GGCAGACAGA 

1051 GATCTCGTCG ACAAAGGGCA GAACTGGGGA CATATGGGCC TGCACTATAG 

1101 TGTGAGGTTG GTAGAGAATG TGATCATGGT CTTGGTTTTT AAGTTCTTTG 

1151 GAGTGAAAGT GTTACTGAAT TACTGTCATT CCTTGATTGC CTTGCAGCTC 

1201 ATTATTGCTT ATCTGATTTC CATTGGCTTC ATGCTCCTTT TCTTCCAGTA 

1251 CTTGCATCCA TTGCGCTCAC TCTTCACCCA TAATGTAGTA GACTACCTCC 

1301 ATTGTGTCTG CTGTCACCAG CACCCTCGGA CCAGGGTTGA GAACTCAGAG 

1351 CCACCCTTTG AGACTGAAGC AAGGCAAAGT GTTGTCTGA 

HOMOLOGOUS PROTEINS: 

Top BLAST Hits: 
                                                                   Score   E
gi|6502963|gb|AAF14527.1|AF155511_1 (AF155511) KX antigen [Mus      366    e-100
gi|10835267|ref|NP_066569.1| Kell blood group precursor (McLeod     361    1e-98
gi|2135606|pir||I39294 McLeod syndrome-associated protein XK -      358    8e-98
gi|3183551|sp|P51811|XK_HUMAN MEMBRANE TRANSPORT PROTEIN XK (KX     358    1e-97
gi|4759330|ref|NP_004668.1| Testis-specific XK-related protein       76    8e-13

BLAST to dbEST: 
                                                                   Score   E
gi|1891549 /dataset=dbest /taxon=9606                               383    e-104


EXPRESSION INFORMATION FOR MODULATORY USE: 

library source: 

Expression information from BLAST dbEST hits: 
gi|1891549 Germinal center B cells 

Expression information from PCR-based tissue screening panels: 
Mixed tissue 
FORM 1: 



1 MDRVYEIPEE PNVDPVSSLE EDVIRGANPR FTFPFSILFS TFLYCGEAAS 

51 ALYMVRIYRK NSETYRMTYT FSFFMFSSIM VQLTLIFVHR DLAKDKPLSL 

101 FMHLILLGPV IRCLEAMIKY LTLWKKEEQE EPYVSLTRKK MLIDGEEVLI 

151 EWEVGHSIRT LAMHRNAYKR MSQIQAFLGS VPQLTYQLYV SLISAEVPLG 

201 RVVLMVFSLV SVTYGATLCN MLAIQIKYDD YKIRLGPLEV LCITIWRTLE 

251 ITSRLLILVL FSATLKLKAV PFLVLNFLII LFEPWIKFWR SGAQMPNNIE 

301 KNFSRVGTLV VLISVTILYA GINFSCWSAL QLRLADRDLV DKGQNWGHMG 

351 LHYSVRLVEN VIMVLVFKFF GVKVLLNYCH SLIALQLIIA YLISIDFMLL 

401 FFQYLHPLRS LFTHNVVDYL HCVCCHQHPR TRVENSEPPF ETEARQSVV 



FORM 2: 

1 MNTRPQHSER TSTMDRVYEI PEEPNVDPVS SLEEDVIRGA NPRFTFPFSI 

51 LFSTFLYCGE AASALYMVRI YRKNSETYWM TYTFSFFMFS SIMVQLTLIF 

101 VHRDLAKDKP LSLFMHLILL GPVIRCLEAM IKYLTLWKKE EQEEPYVSLT 

151 RKKMLIDGEE VLIEWEVGHS IRTLAMHRNA YKRMSQIQAF LGSVPQLTYQ 

201 LYVSLISAEV PLGRVVLMVF SLVSVTYGAT LCNMLAIQIK YDDYKIRLGP 

251 LEVLCITIWR TLEITSRLLI LVLFSATLKL KAVPFLVLNF LIILFEPWIK 

301 FWRSGAQMPN NIEKNFSRVG TLVVLISVTI LYAGINFSCW SALQLRLADR 

351 DLVDKGQNWG HMGLHYSVRL VENVIMVLVF KFFGVKVLLN YCHSLIALQL 

401 IIAYLISIGF MLLFFQYLHP LRSLFTHNVV DYLHCVCCHQ HPRTRVENSE 

451 PPFETEARQS VV 



FEATURES : 

Functional domains and key regions : 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION 
N-glycosylation site 

Number of matches: 2 

1 302-305 NFSR 

2 323-326 NFSC 

[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE 

cAMP- and cGMP-dependent protein kinase phosphorylation site 

Number of matches: 2 

1 59-62 RKNS 

2 169-172 KRMS 

[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 

Number of matches: 6 

1 64-66 TYR 

2 137-139 TRK 

3 157-159 SIR 

4 252-254 TSR 

5 264-266 TLK 

6 354-356 SVR 

[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 

Number of matches: 3 

1 17-20 SSLE 

2 18-21 SLEE 

3 431-434 TRVE 

[5] PDOC00007 PS00007 TYR_PHOSPHO_SITE 
Tyrosine kinase phosphorylation site 

126-133 KEEQEEPY 

[6] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 4 

1 215-220 GATLCN 

2 321-326 GINFSC 

3 343-348 GQNWGH 

4 350-355 GLHYSV 

[7] PDOC00029 PS00029 LEUCINE_ZIPPER 
Leucine zipper pattern 

100-121 LFMHLILLGPVIRCLEAMIKYL 
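
The site lists above can be spot-checked by scanning the protein sequence with regular expressions. The sketch below is illustrative only: the regexes are hand translations of the PROSITE patterns for PS00001, PS00005 and PS00006 (an assumption to verify against PROSITE itself), and the helper `scan` is a hypothetical name. A lookahead is used so that overlapping matches, such as SSLE and SLEE, are both reported.

```python
import re

# Hand-translated PROSITE patterns (assumed equivalents, not taken from the listing):
#   PS00001  N-{P}-[ST]-{P}
#   PS00005  [ST]-x-[RK]
#   PS00006  [ST]-x(2)-[DE]
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
}

def scan(protein, pattern, offset=1):
    """Return (begin, end, matched_text) for every match, overlapping ones included.

    `offset` is the 1-based residue number of the first character of
    `protein`, so that fragments can be scanned with correct numbering.
    """
    rx = re.compile(f"(?=({pattern}))")
    hits = []
    for m in rx.finditer(protein):
        begin = m.start() + offset
        hits.append((begin, begin + len(m.group(1)) - 1, m.group(1)))
    return hits

# Residues 301-330 of FORM 1 contain both listed N-glycosylation sites.
print(scan("KNFSRVGTLVVLISVTILYAGINFSCWSAL", PATTERNS["ASN_GLYCOSYLATION"], offset=301))
# Residues 1-21 of FORM 1 contain the two overlapping CK2 sites.
print(scan("MDRVYEIPEEPNVDPVSSLEE", PATTERNS["CK2_PHOSPHO_SITE"]))
```

Short motifs like these match frequently by chance, so a hit indicates a candidate site rather than a confirmed modification.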



Membrane spanning structure and domains : 
     Begin   End   Score   Certainty 
1    36      56    1.443   Certain 
2    74      94    2.084   Certain 
3    102     122   0.920   Putative 
4    181     201   0.811   Putative 
5    208     228   1.744   Certain 
6    273     293   1.234   Certain 
7    312     332   1.785   Certain 
8    366     386   0.828   Putative 
9    389     409   1.497   Certain 
BLAST Alignment to Top Hit: 

>gi|6502963|gb|AAF14527.1|AF155511_1 (AF155511) KX antigen [Mus 
musculus] Length = 446 

Score = 366 bits (930), Expect = e-106 

Identities = 179/411 (43%), Positives = 265/411 (63%), Gaps = 11/411 (2%) 

Query: 33 FPFSILFSTFLYCGEAASALYMVRIYRKNSETYRMTYTFSFFMFSSIMVQLTLIFVHRDL 92 

FP S++ S FL+ E A+ALY+ YR + T F + +VQ TL+FVHRDL 

Sbjct: 3 FPASVIASVFLFVAETAAALYLSSTYRSAGDRMWQVLTLLFSLMPCALVQFTLLFVHRDL 62 

Query: 93 AKDKPLSLFMHLILLGPVIRCLEAMIKYLTLWKKEEQEEPYVSLTRKKMLI-DGEEVLIE 151 

++D+PL+L MHL+ LGP+ RC E Y + ++ EEPYVS+T+K+ + DG +E 
Sbjct: 63 SRDRPLALLMHLLQLGPLYRCCEVFCIYC QSDQNEEPYVSITKKRQMPKDGLSEEVE 119 

Query: 152 WEVGHSIRTLAMHRNAYKRMSQIQAFLGSVPQLTYQLYVSLISAEVPLGRVVLMVFSLVS 211 

EVG + L HR+A+ R S IQAFLGS PQLT QLY++++ + GR +M SL+S 
Sbjct: 120 KEVGQAEGKLITHRSAFSRASVIQAFLGSAPQLTLQLYITVLEQNITTGRCFIMTLSLLS 179 

Query: 212 VTYGATLCNMLAIQIKYDDYKIRLGPLEVLCITIWRTLEITSRLLILVLFSATLKLKAVP 271 

+ YGA CN+LAI+IKYD+Y++++ PL +CI +WR+ EI +R+++LVLF++ LK+ V 
Sbjct: 180 IVYGALRCNILAIKIKYDEYEVKVKPLAYVCIFLWRSFEIATRVIVLVLFTSVLKIWWA 239 

Query: 272 FLVLNFLIILFEPWIKFWRSGAQMPNNIEKNFSRVGTLVVLISVTILYAGINFSCWSALQ 331 

+++NF PWI FW SG+ P NIEK SRVGT +VL +T+LYAGIN CWSA+Q 

Sbjct: 240 VILVNFFSFFLYPWIVFWCSGSPFPENIEKALSRVGTTIVLCFLTLLYAGINMFCWSAVQ 299 

Query: 332 LRLADRDLVDKGQNWGHMGLHYSVRLVENVIMVLVFKFFGVKVLLNYCHSLIALQLIIAY 391 

L++ + +L+ K QNW + ++Y R +EN +++L++ FF + + C L+ LQL+I Y 
Sbjct: 300 LKIDNPELISKSQNWYRLLIYYMTRFIENSVLLLLWYFFKTDIYMYVCAPLLILQLLIGY 359 

Query: 392 LISIDFMLLFFQYLHPLRSLFTHNVVD YLHCVCCHQHPRTRVENSEP 438 

I FML+F+Q+ HP + LF+ +V + L C C R ++SEP 

Sbjct: 360 CTGILFMLVFYQFFHPCKKLFSSSVSESFRALLRCACWSS LRRKSSEP 407 



ALIGNMENT OF FORM 1 AND FORM 2: 

>FORM 2 

Length = 462 (Length of FORM 1 = 449) 
Score = 900 bits (2301), Expect = 0.0 
Identities = 447/449 (99%), Positives = 447/449 (99%) 



FORM 1:   1 MDRVYEIPEEPNVDPVSSLEEDVIRGANPRFTFPFSILFSTFLYCGEAASALYMVRIYRK 60 
            MDRVYEIPEEPNVDPVSSLEEDVIRGANPRFTFPFSILFSTFLYCGEAASALYMVRIYRK 
FORM 2:  14 MDRVYEIPEEPNVDPVSSLEEDVIRGANPRFTFPFSILFSTFLYCGEAASALYMVRIYRK 73 

FORM 1:  61 NSETYRMTYTFSFFMFSSIMVQLTLIFVHRDLAKDKPLSLFMHLILLGPVIRCLEAMIKY 120 
            NSETY MTYTFSFFMFSSIMVQLTLIFVHRDLAKDKPLSLFMHLILLGPVIRCLEAMIKY 
FORM 2:  74 NSETYWMTYTFSFFMFSSIMVQLTLIFVHRDLAKDKPLSLFMHLILLGPVIRCLEAMIKY 133 

FORM 1: 121 LTLWKKEEQEEPYVSLTRKKMLIDGEEVLIEWEVGHSIRTLAMHRNAYKRMSQIQAFLGS 180 
            LTLWKKEEQEEPYVSLTRKKMLIDGEEVLIEWEVGHSIRTLAMHRNAYKRMSQIQAFLGS 
FORM 2: 134 LTLWKKEEQEEPYVSLTRKKMLIDGEEVLIEWEVGHSIRTLAMHRNAYKRMSQIQAFLGS 193 

FORM 1: 181 VPQLTYQLYVSLISAEVPLGRVVLMVFSLVSVTYGATLCNMLAIQIKYDDYKIRLGPLEV 240 
            VPQLTYQLYVSLISAEVPLGRVVLMVFSLVSVTYGATLCNMLAIQIKYDDYKIRLGPLEV 
FORM 2: 194 VPQLTYQLYVSLISAEVPLGRVVLMVFSLVSVTYGATLCNMLAIQIKYDDYKIRLGPLEV 253 

FORM 1: 241 LCITIWRTLEITSRLLILVLFSATLKLKAVPFLVLNFLIILFEPWIKFWRSGAQMPNNIE 300 
            LCITIWRTLEITSRLLILVLFSATLKLKAVPFLVLNFLIILFEPWIKFWRSGAQMPNNIE 
FORM 2: 254 LCITIWRTLEITSRLLILVLFSATLKLKAVPFLVLNFLIILFEPWIKFWRSGAQMPNNIE 313 

FORM 1: 301 KNFSRVGTLVVLISVTILYAGINFSCWSALQLRLADRDLVDKGQNWGHMGLHYSVRLVEN 360 
            KNFSRVGTLVVLISVTILYAGINFSCWSALQLRLADRDLVDKGQNWGHMGLHYSVRLVEN 
FORM 2: 314 KNFSRVGTLVVLISVTILYAGINFSCWSALQLRLADRDLVDKGQNWGHMGLHYSVRLVEN 373 

FORM 1: 361 VIMVLVFKFFGVKVLLNYCHSLIALQLIIAYLISIDFMLLFFQYLHPLRSLFTHNVVDYL 420 
            VIMVLVFKFFGVKVLLNYCHSLIALQLIIAYLISI FMLLFFQYLHPLRSLFTHNVVDYL 
FORM 2: 374 VIMVLVFKFFGVKVLLNYCHSLIALQLIIAYLISIGFMLLFFQYLHPLRSLFTHNVVDYL 433 

FORM 1: 421 HCVCCHQHPRTRVENSEPPFETEARQSVV 449 
            HCVCCHQHPRTRVENSEPPFETEARQSVV 
FORM 2: 434 HCVCCHQHPRTRVENSEPPFETEARQSVV 462 
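
Because the alignment of FORM 1 and FORM 2 contains no gaps, the identity count can be reproduced by a position-by-position comparison at a fixed offset (FORM 1 residue 1 pairs with FORM 2 residue 14). The sketch below assumes exactly that ungapped layout; the function name `ungapped_identity` is hypothetical.

```python
def ungapped_identity(a, b, offset=0):
    """Compare a[i] with b[i + offset] over the overlapping region.

    Returns (identical, compared, differences), where each difference is
    (position_in_a, residue_a, position_in_b, residue_b) with 1-based
    positions.
    """
    n = min(len(a), len(b) - offset)
    diffs = [
        (i + 1, a[i], i + 1 + offset, b[i + offset])
        for i in range(n)
        if a[i] != b[i + offset]
    ]
    return n - len(diffs), n, diffs

# Fragments taken from the alignment above (not the full sequences).
form1_start = "MDRVYEIPEE"
form2_start = "MNTRPQHSERTSTMDRVYEIPEE"   # 13 extra N-terminal residues
print(ungapped_identity(form1_start, form2_start, offset=13))

# FORM 1 residues 61-70 against FORM 2 residues 74-83: one R/W difference.
print(ungapped_identity("NSETYRMTYT", "NSETYWMTYT"))
```

Run on the full sequences with `offset=13`, this comparison would be expected to report the two substitutions visible in the alignment (R/W and D/G), consistent with 447 identities out of 449.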



Hmmer search results (Pfam) : 

Model Description Score E-value N 

CE00306 E00306 Membrane_transport_protein_XK 390.8 1.3e-113 1 

Parsed for domains: 

Model Domain seq-f seq-t hmm-f hmm-t score E-value 

CE00306 1/1 31 416 1 384 [. 390.8 1.3e-113 
1 TATTATTATT ATTATTAAGA CGTAATCTTG CTCTGTTGCC CAGGCTGGAG 

51 TGCAGTGGCG TGATCTCAGC TCACTGCAAC CTCTGCCGTC CGGGTTCAAG 

101 TTTTTCTCCT GCCTCAGCCT CCTGAGTAGC TGGGATTACA GTCACGCACC 

151 ACCACGACCA GCTGATTTTT GTATTTTTAG TAGAGATGGG GTTTCACCAC 

201 GTTGGCCAGG CTGGTTTCGA ACTCCTGACC TCAAGTGATC TGCCTGCCTC 

251 AGCCTCCCAA AGTGCTGGGA TTACAGGCGT GAACCACTGT GCCTGGCCTT 

301 CATCTATATT ATTACCAGGA GGCAGATGTG TTCTCTTTTT CTCTGAGGTT 

351 TAGAATTATG CAAATGAAGA TATGAAAACA AAAGCTCAGT GAGGTGGGGA 

401 GGATTACACT TAAGAATACA GGTAATTTTC AAAGCTCTTT AAGACACCCC 

451 TCTCAGTTTT TACTAACAGC TCTCTCTTGG CTCTTTGCCA GTCTGTTTAG 

501 AATTTGGCAC CTCTTCATAA CCTTTCAACC AAAGACCTGT AAGTTCATTC 

551 TAAAGCTCCT ATCCTGGCCT CATTTTGCAA GTGGAGAAAT CAAGGCATAA 

601 AATATGAGCT TTCAGTGTCT GTGGGCTGAC CTTGAGTCTT GACCTTTATC 

651 CTGTTCTATC TTCCCTCCGC CGAAAACTCT GACCCTATTC CTCCCAGGTT 

701 CCCCCTTCAT GATATTATCT GGAGGGCAAT AGGACCTAGG GAGGTTCCAC 

751 CCTGCGGCGG AGGGAGACAC ACCTGCCTAA CAGCGTGGGT AGAGTGAGTG 

801 TTGAAGCAAG TCACTTAACT AGTTAGGGAG GGCGGGGTAG AAGTGGGGGC 

851 CTGCTGCTCC TAGGGAGGAG TAAAGCTGTG GCTCCTGCCT GGGTCTGGAG 

901 GTGGTGGTCA GAAGTGCTTC TGAAGAGCGG CCCAAGCCCC TTTTTGTCCC 

951 GCCACTCCAC AACGAGCATC CCTCGGCTGG CCGCCTGCCC GGGAACTCTC 

1001 CGGCTGGTTT TGTTTGGCCG CAGCCGTCCC GCCCATCTCG CCCGCCCCCG 

1051 CCGTCCCGGT GCCTTAGTTT TTGAAGCTGC CGACCTCTCG CAGCTGGAAT 

1101 CGCAGACCAG GCAGGACCCT GGCAGCAGAC GGCGTCCAAG AGTTTGGCGA 

1151 CCTCCGTCCA GCCAGGTTGG CGCCCCGCAC ATCGTGCCTC TCACTAGCAA 

1201 AGTTTCTCCG AGGAGAAGCA GCCCCTCCAG CCTTTTCTTC ATCCTGTAGA 

1251 GCGAGCGCGC TCTGCTTCTG TCCCTCAACA CTGCATTCGG AGACAGGGTG 

1301 GTGACAATAC TCCACTCCCG GGCCAGGCGG TCTTGGGGGC GGGGCTTGGG 

1351 GGAATCCGAG GAGCTATCCT GAGAACCCTG GACTCGGCAA AGGTCCTGAG 

1401 AGCGCGCAGG TGAGCGGGCC AGCTGATAGC TACAGCCTAG CAATAGCTAG 

1451 GATACCTAGG CACTGAACTG AATCCCCTCT TCTGCCCTCC TTCTTCTGCG 

1501 CCCGCTCTTC TGCCCTGGCT CAGCTCTCCG CTGACTTGAG AGGACACACT 

1551 GGTCAGGACT CTTTGTGAGG AGCTGCTGAG TGTCGGTGCC CCCGACAGAT 

1601 CGGCTACACC CTGCCTGAGG GGCTGCGAAA GGAGCCGCCA CGGAAGCCGC 

1651 TGTTCTCATG ACTCTTCACG TCCCTGGAGT TGGACTCTGG ATGGGGCGCT 

1701 GGGATGCTTG CTTTTGTCTT GTTCAAGTTT CACAGCAAGT ATGTTGACGA 

1751 TTGGAATCGG GGCCAATCAA GAGTCAAGTT CAAAGTGGTA CTCCTGGGCT 

1801 TTCCATCCCA GACTCCAAGT CGAATCTGAG TCTAGAAGAG AGCGGTTTCT 

1851 TGCTCTAACT AGTGAATCTC TGTTCCCAAA CTGGACTTGA CAGAGCTCTC 

1901 CTCACCTATA CTTGGACTGT AGCGGCCATA GGGTTCTCTT GGGGATGGGT 

1951 GGGAGGGTGC TATGAACACA AGACCACAAC ATTCAGAAAG AACCTCGACA 

2001 ATGGACAGAG TTTATGAAAT TCCTGAGGAG CCAAATGTGG ATCCGGTTTC 

2051 ATCTCTGGAG GAAGATGTCA TCCGTGGAGC CAACCCCCGA TTTACTTTTC 

2101 CATTTAGCAT CCTTTTCTCC ACCTTTTTGT ACTGTGGGGA GGCTGCATCT 

2151 GCTTTGTACA TGGTTAGAAT CTATCGAAAG AATAGTGAAA CTTACTGGAT 

2201 GACATACACC TTTTCTTTCT TTATGTTTTC ATCCATTATG GTCCAGTTGA 

2251 CCCTCATTTT TGTCCACAGA GATCTAGCCA AAGATAAACC GCTATCATTA 

2301 TTTATGCATC TAATCCTCTT GGGACCTGTT ATCAGGTGAG CAACTTTTAA 

2351 ATCTTTTCCT TACCCCCCTA ACCCCACCCC AGACTTGGGC AGAGAAAGAT 

2401 GAAAGATTTA CAAGATGGAT ACTATGGCTC TAATCAATTC TCTCATTTCC 

2451 TCCCACTCTC GGCTTCCCTG TCTACCATTC AGAAAACTTA CCTGAAATCT 

2501 TAAATGCCAC CATGATGAAC ATGTGGTATG TACTTGTGTT CCAAAACAAT 

2551 GAACGATGCT ATTTGGGCTG TGTAAACTAG AATGGGAACA ACAAGACGTG 

2601 ATCACCCTGT GCATGAAGGC CATAGCTGCA GAGTGTGTAA TTTTATTTAA 

2651 AAAAATTTTT TTTTCTGAGA CAAGGTCTTG CTCTGCCTCC CAGGCTACAG 

2701 TGCAGTGGTG CGATCATGGC TCACTGCAGC CTTGATCTCC TGGGATCAAG 

2751 CGAACCTCCC ACCTCAGCCT CCAAGTAGCT GGGACCAAAG GAATGTGTCA 

2801 CCATGCCTGG TTAATTAAAA AAAAATTTTT ATAGGCCGGG TGTGGTGGCT 

2851 CATGCCTGTA ATCCCAGCAC TTTGGGAGGC TGAGGCGGGT GGATCACCTG 

2901 AGGTCAGGAG TTCAAGACCA GCTGGCCAAC ATGGTGAAAC CCCTGTCTCT 

2951 ACTAAAAATC AGCTGGGTGT GGTGGCGCAT ATCTGTAATC CCAGCTACTC 

3001 TGGTGGCTGA GGCAGGAGAA TCACTTGAAC CCGGAAGGTA GAGGTTGCAG 

3051 TGAGCCAAGA TCGGTGCCAC TGCACTCCAG CCTGGGCGAT AGAGTGAGAC 

3101 TCCATCTCAA AAAAAAAAAA ATTTTTTTTG TAGAGACGGG ATCTCGTTAT 
3151 GTAGACTGGG CTCAAGTGAT CTTCCTGCCT CAGCCTCCCA AAGTGAGCCA 
3201 CCACGCCTGG TCTGAGTGTG TAATTTTGAC TCTACCTTTT TGGATGCTTT 
3251 GTAAATTGGA TAAAAGTTTC TTTACCCTGA GCTGCTTGGG CTGGTGCTAC 
3301 TGCCATTTTC AAATTTTCCA GAGTAATGTG ACATCTGGAA ACTATTTTAA 
3351 ACCATCTGTG GTAATCTGTA CCCCAACCCA ATATAGTTCA GTTCTCTGTC 
3401 GGTTTATCAG TTTCCTATTT ATCTCTTTGT ATATTTCTGC AATAAAGATA 
3451 CGAAGTTGGG AGGGGGCAAA GGAAGGCAGT TCATCTCTCT ATGTGGATGC 
3501 AGTAGCACAA TTTAATAGTA TCAAGTATTT CCATTCAGAT TGCCTTGAAG 
3551 TGGAAAGAAT GCACTTAATC CTAGCGAGAT AGGCACCTGT GTCAACAGTC 
3601 TCATCTGGAT GCTATGGGGT TTTCAAGGTA GAGAGATGTT GCAAAACTTA 
3651 TGAGTTCAGG AGTAAGGAAT GGACCAAGTT TGTCTTGATT GCGAGAGAGG 
3701 CAGACAACTG CAGTCAGCCG AGGAATATGG GTCAGAGTGT TGCAATGGGA 
3751 AGATACCTCA TCATTAGACA ACTAAAAAGT CTGTGAAACT AATTAAGGAT 
3801 GGAACTCACT CCTTTATAAA ATTTCATATC TGTACACATG TATAATTTTT 
3851 ATTTGTCACT TATACCTCAA TAAGGCCAAA AAAATTTTTT ATCAATAAAT 
3901 TTTTAAGTGG GGAGGAATCG ATTAGGCTCT ATCAGAGAGA ATATGGGATA 
3951 TCAATGGAAA CAGTGGCCTG AAATTTGGAG TCTAGTCTTC CGCCTGTCAT 
4001 TGACTGGTTG TGTGTTCTTG GTAAAATCTC TGAAGATGGC TTCACAGGAA 
4051 GGCATATAGA GTTCCCTCAT CTGTAAAGCA AATGGGTTAG TCTAAATCAT 
4101 GGGTCTCAAA CTCAAACACT TGCAGGGACC AGGCAGGTAT CATAAATGAA 
4151 TGAAGCAGGC CTAGTATAAG AAAAAACAGT AGCCTTGTGT GAGATGATAA 
4201 ATGGAAACAA AGTCTCAGAG AAATACTGAG GAGTAGTGAG TACCATGGTA 
4251 ATCTGAAATC TTCATGACCT GCCTGAAGGA GGTAGCCCCT CTAGAGCCCT 
4301 GGCGCATTGT TTCCATGTTG GAATTCAGAC CCAGTATTGC CAGATCCACT 
4351 AACTTTTCGG GAGATGCTCC CAAGACAGGA TTTTTATATG AAATGTCATG 
4401 ATTTTAAATT TTCACAGCTG ACTAAAACAA TAACAACAAC AACACAGGAT 
4451 GGACCAAACC ATATCTGTTG GTCAGATATA ACTCAGCTGG CCTATATGCA 
4501 TCTTTGGACT GGGTGATGTA AAGGTCCTTT ACGGTTCTAA ATCTTTGAAG 
4551 TTAAGCTGTA AAAGGAAGAC CTCATCTTGA CCTTGAAACC AAGAAATTTA 
4601 AAGTTGTGAC TACAGGAGCA AATAAACCAT TCATCCCTCC TTTTTCAAAT 
4651 ACAATATATT GAGTTAACCA ATCGAAAACT CTCAAGATAC AAATTTCAGA 
4701 AAGTACCCAG CTGCACCCTC CCCTCTTTTT GACTTCCTTT GTTTGCTTTG 
4751 TGAACCCTCT GTGTAGAGTG TTGAGTACTG TTTTTCATTT TTGTTGTTTA 
4801 GCTTCCACTA GAAATGATTG GGAAGCATTT ATAACCTCAG GCAGCTTAGC 
4851 CCACAGCAGA GAAAAGATAA AAACTCATAA ATTATACTCT GGATTCGCTT 
4901 ATTTTCAAGG CCAATTACTT GTTAGATAGG TAGGAACTTG ATTAGTGTTA 
4951 TCAGGCACAT GAAGGTGCTT GTAGAGTCTG GGTGCCTTAC ATGAAATGCA 

5001 AGCATACTTC CGAAATGAAA ATGTACTCTA ATTTATTGAA GCTTATAAAT 
5051 GGACAAACAC CCTTACTTAA ACCAGAAAAT AGCCCTGAGA ATAGAAACAG 
5101 AACATTTATG TAAATGTAAA CGGAACATTT CATGCCACCA CCTTCTCCAA 
5151 TACTGTTCTC CAATTTAGCA ATAGTACTGA TGGGTTGGGG TTAAAATCTA 
5201 AAATTTTTCA TTGAAAATGC ACTTATGCAG AACAAGAATA GGAAAAAAGT 
5251 GTTGCTTTTT CTTCTCTGTT CTTTCTTTGC ATCTTTTTCT TTCCCAGGTC 
5301 TTAGAGTTTG TCCCTAGAAG GTGACAATTT CAAACTACAT GCTTCAGAGT 
5351 GGTACACATG CATCAGTCTT AGGGTGATCT ATGGAGACTG GCAGCCAGCA 
5401 TATGTTCCAA ATTTTCCTAT CAGGAACTAA AGGCTAGAGA GCATATCAAC 
5451 CTCTGGGCTT GTCTTTGGTC TACTTTTCTG TTAAATTTCA TTGCTGTTAT 
5501 TATTATCCTC TCCTCCCATA ATTGCTTACC CTGTATTATT TTCTTCCTTC 
5551 TTATTCTTTC ATTTACTCAG CAAATATTTC TCAAATACCT ACTAAGTGAT 
5601 AAGAGCTGTA AACAAGATAA ATACAACCCT TGACCTCAGT CTCTTGGGCA 
5651 AGACGTGTTA ATGTCCACTA CAAATGTTCT TACTAGTCAT AAGTAGTCCA 
5701 CAGTTTTTAT TCATTAAAGG TGAGTGGCGA AGTGGTAACT CAGGTGTTCC 
5751 AGTAACAAGA ATGTTCTAGT TGCTTCTCTT CCACTTACCA CATCAGAACT 
5801 GCTAAAGACT TCTGATTTGT ATGGGGGAGG TGGGAGGGGC AGAGCAGGAA 
5851 ATGTCATCTT ACCCTTATTC CAAGGATGAT AGGCTTTCAT AAGGATGTTT 
5901 TTCTCTTCGT AAAGAAAGAA TCCAGTTTAA AAGGCTTTTG TCCACAAACA 
5951 GGACAAGAGG CACAAAAAGT AACTATTACA GTGATCTTTC GAGGGCCTAG 
6001 TTATGTAGTT CATTCAGGTT TGAGTTGTCG TCTTTTAAGT ACTTTTGTTG 
6051 CTTTGATGGC TTCCTGTGTA TATGAGATAT TTTTTTTCCT CTGATCTGTC 
6101 CCAAGACTTT TTGGCTGAGA TATGGTTGTG AGCCCTTTCT TGAAAAAGCA 
6151 GAATCTGGCC AGGCGCAGTG GCTCATGCCT GTAATCTCAG CACTTTGGGA 
6201 AGCTGAGGTG GGTGGATCAC CTGAGGTCAG GAGTTCAAGA CCAGCCTGGC 
6251 CAACATGGTG AAAACCCGTC TCTACTAAAA ATACAAAAAA AAAAAAAACC 
ATGCCTGTAA TCCCAGCTAC TCAGGAGGCT 
ACCCAGGAGG CAGAGGTTAC AGTGAGCTGA 
GCCTGGGCGA CAGAGCAAGA CTCTGTCTCA 
GAAAGAAAAA GAAAAAGCAG AATCTAAAAC 
CTTTGAGGGA GGAATGCTTT ACCTCACGAA 
CCTTTGGAAC CTTCATTATT TTGCTAGGAA 
TTTGTGTTCA AGGCACTTTT CTACCTGCCA 
CAAAACAATT CCTCGAGTCC TCAAACAAGT 
AGGTCAGTCG ATGACTGAAC AAAAATGGAT 
GAAGGCATGA TCCACCCTTT GACTTATGAG 
AGAAAAAGAC AAAAAGTAGT GCAGGCTGGC 
ATCCCAGCAC TTTAGGATCC CAGCACTTTG 
TTGAGCCCAG GAGTTTGAGA CCAGTCTGGG 
TCTACACAAA TTAAAAATAG CTGGCATGGT 
GCTACTCAGA AGGCTGAGGT GGGAGGATCA 
GCTGCAATGA ATTATGATTG TGCCACTGCA 
TAAGACCTTG TCTCAAAAAT AAAATAAAGT 
TTTTTTTCCC TCACTACAAC CTCCCTTCCC 
GCATGATGCT TTACTTCTGC AGATGTTTGG 
ACACTGTGGA AGAAAGAGGA GCAGGAGGAG 
AAAGAAGATG CTAATAGATG GCGAGGAGGT 
GCCACTCCAT CCGGACCCTG GCTATGCACC 
TCACAGATCC AAGCCTTCCT GGGCTCAGTG 
CTATGTGAGC CTGATCTCTG CAGAGGTTCC 
GTCAGGAGAG GGGAGGGCTC CAGTTAAATC 
CCCAAGCTGT CTAATAAACT GGCCACTAGC 
AAAATTAAAT AAAATTAAAA ACTTGTTCAT 
GTTCTCAGCA GCCGTGTGTT GCTAGCAACT 
TATAAACATT TCCATCATCA CAGAAAGTTC 
TAGTTAAATA ACTTGTGGAG TCAGACATCT 
AACAGGTAAG CTGTTTAGAC TAAAAATGTC 
CCCAGAAGAA GCTAGTAATA CCAGCAGTCA 
CAAAATGTTT AAATTATGCT GTTGTTTTGT 
TTAATAAGAG GTTCCCAAAT AGTACTGATC 
TTCTTTTTGA AATTATATTC ACTCCCCAGA 
TATTTCTAAA AGGTACCCAG TTGATTTTGA 
ATGATTTAAT CATTTCTGCT AATGCCAGTG 
GGGCTGGGCA CGGTGGTTCA AGCCTGTAAT 
AGGCGGGTGG ATCACAAGGT CAGGAGATTG 
ATGAAAACCC GTCTGTACTA AAAATACAAA 
CAGGTGCCTG TAGTCCTAGC TACTCGGGAG 
TGAACCTGGG AGGCGGAGCT TGCAGTGAGC 
CCAGCCTGGG TGACAGAGCA AGACTCCGTC 
AAAAAAAAAA GCGTGGGGTT AATACTAATG 
TTCTGACCTT CACTGTGATC TTTGGAGGAA 
CTAATTTCCC ACTTGTAGAA GAGGGATCCT 
CAGAAGATGA AATGTGAGTC AGTGTTTTCA 
TAATGAATTT TAACAGCCTG AGATTTGCTT 
GGTATAGGTG TGGGTACAGG TTTGGACCAT 
TGTTTGGCAA AGTCCCATGT CTCAAATAAG 
CTTTTGTCTT TTTTCCCCAC TCAGAATTGT 
TGAAGCTTTT CACATACATA GTAGTTTGAG 
ATGATGCTTT TCCTTTAAAT CATCTAATAA 
TGGGCATGAT GGCTCACGCC TGTAATCCTA 
AGGCAGATCA CTTGAAGTTG GGAGTTCAAA 
GAAACCCCAT CTCTACTAAA AACACAAAAA 
GGCTGATACC TGTAATCCCA GCACTTTGGG 
CCTGAGGTCA GGAGGTTGAG ACCAGCCTGG 
ACTAAAAATA CAAAAATTAG CCGGGTATGG 
AGCTACTTGG GAGGCTGATG CATGAGAATC 
ATCTCACCGT TGCACTCCAG CCTGGGCAAC 
AAAAAAATTC AGCCAGGCGT GGTGGTGGGT 
GGGAGGCTGA AGCAGGAGAA TTGCTTGAAC 
ATAATTATCA 
TAAGTATATT 
TCGCCACACA 
ATCAGCTCTT 
CTATCCAGAT 
CTTTCTTTCT TTCTTTCTTT CTTTCTTTCT 
TTTCTTTCTT TCTTTCTTTC TTTCTTCTTT 
TTTTCCTTTT CTTTTTCTTT CTTTTCTTTC 
TCTTTCTGTC TTCCTCCCTT CCTCCCTTTC 
TTCCCTTCCT CCTTACAGGC ATGCACCACC 
TTTTTAGTAG AGTACCGGGT TTCACCATGT 
TCCTGACCTC AGGTGATCCA CCCACCTCAG 
TACAGGTGTG AGCCACCGTG TGCACGGCTG 
GTTTTATTTT TTATTTTTTT TCTTTTTGAG 
CCAGGCTGGA GTGCAGTGGG GCAATCTCAA 
GTGGGTTCAA GTGATTCTCG TGCCTCAGCC 
AGGCATGCCA TCATGCCTGG CTAATTTTTT 
AGTCTCACTC TGTTGCCCAG GCTGGATCGC 
CTATTACCTC TGCCTCCCAG GTTCAAGTAA 
AGGTAGCTGG GAATACAGGT GCACGCCACC 
TTTTTAGCGG AGATGGGGTT TCATCATGTT 
CCTGACTTCA GGTGATCCAT CCGCCTTGGC 
CAGGCATGAG TCACCGCGCC CAGCCTAACT 
CTAATCTCAG AAGTCTTCAT TAATTCCACA 
TATGTTCCAG GTAATATGTT AGGCTATGGG 
ATGGTCCCTC CTGCCTTCAT GGAATTTTCA 
CTGAAGCTAA GTGTTCTAGA AACACACAAA 
AGATATACAT CAAAGAAGGG ACTTCTATTA 
TCTCCTAAGA CTGGATTTTT TCAGATAGAG 
GTTTGCTCCG AAGCCTGCTT CATCAGCAAA 
TGTACTCTTC TCACGTTAGT GACTTCTCAA 
TTAAGGAAGT TTATTTTGTA TATTTATATG 
TATGTTCATC ATGAGAAATT TAGAAAATAG 
TTCTAAAACT GATATAAGAC TATCACACAC 
ATTTTTTCAA TTTTTTGTGC ATCTATTTTG 
GTGTACAATG TGATGTTTCG ATGTATGTAC 
CAACCAAACT AATTAACACA TTCATCACCT 
ACGTGTGTGT GTGTGTGTGT GTGTGTGTGT 
CTCTTTAAAA ATTTCAAGTA CACAATACAT 
ATGTTGTACA TTAGAGCTCT GAAACTTATT 
GTAGCCTTTG ATCAAAATCC TTCTATTTCC 
AACCACCCAT TCTACTCTGT TGCTAGGTGT 
ATATAAGTAA GACAATGCAG TATTTTTCTT 
CTTAGCATAA TGTCCTCTAG GTTCATCTGT 
TTCTGTAATT TTATGGTTGA ATAATATTCA 
CACACACACA CACACACACA CAGACACACC 
TTCATCTGTC AACAGATACT GAGTTTGTTT 
ATAATACTAC AATGAGCATG AGAGTGCAGA 
TTCCTTTAGG TATACACCCA GCAGTGGGAT 
CTGTTTGTAA TTTTTTTGGA GAACCTCCAT 
TGTCAGTTTA TGTTCCCACA AACAGTGTAC 
CCCACCAACA CTTTTTTTTT TTAATAATAG 
TGATATCTCA TTGTGGCTTT GATTTGCATT 
TGAACACCTT TTCATATACC TGTTGGCCGT 
AAGTCTATTC AAGTGCATGC TATTTGTTTA 
CATTTGCTCT TAACTGGAGC TCTCAAGTCT 
CTCTGGGTTA TAAGTACAGC CTTCATTACC 
TTTTGTTTTT GTTTTTGTTT TTAACAGTTG 
GTATCTGTCA CCTATGGGGC CACCCTTTGC 
CAAGTACGAT GACTACAAGA TTCGCCTTGG 
TCACCATCTG GCGGACATTG GAGATCACTT 
CTCTTCTCAG CCACTTTGAA ATTGAAGGCT 
CTTCCTGATC ATCCTCTTTG AGCCCTGGAT 
CCCAGATGCC CAATAACATT GAGAAAAACT 
GTGGTCCTGA TTTCAGTCAC CATCCTCTAT 
CTGGTCAGCT TTGCAGTTGA GGTTGGCAGA 
GGCAGAACTG GGGAGATATG GGCCTGCACT 
AATGTGATCA TGGTCTTGGT TTTTAAGTTC 
15751 TTTGGAGTGA AAGTGTTACT GAATTACTGT CATTCCTTGA TTGCCTTGCA 

15801 GCTCATTATT GCTTATCTGA TTTCCATTGG CTTCATGCTC CTTTTCTTCC 

15851 AGTACTTGCA TCCATTGCGC TCACTCTTCA CCCATAATGT AGTAGACTAC 

15901 CTCCATTGTG TCTGCTGTCA CCAGCACCCT CGGACCAGGG TTGAGAACTC 

15951 AGAGCCACCC TTTGAGACTG AAGCAAGGCA AAGTGTTGTC TGATTCTATT 

16001 TTCTGGGTAT TTTAGGAAGA GTTGGGAGTT GCCAAGAGTA ACCATGAAAT 

16051 TGAACGAAAG GATGAGGTTC ATGGGTGAGA TACCCATCAG TACATTTTCT 

16101 TGACTTTTCT GTTAAGCCTA TCAGAAGAAA GAGCAACTCC CAAATAGGTT 

16151 TTATTTTCTT AAGAGTTACC ACTATGTTTG GAAACAGGGG GTATCGACTA 

16201 TATAGTTGAA AGGGTCAGAA ATACCATTCA CACCCTTCTT ACCCAAGTCA 

16251 ATTGGAATAA CTTGTCTTCA AACACTTTAG GCTCTCTAAA GTGACCTTCT 

16301 AGCTCTGCTC ATTTGCTTGA TGCATTTCTG AGCTTTCCTG GGCTGAGCTG 

16351 AAGGCCCAGA ATCCCGCTAG AATATATCCT GACTGATCAG AGGATATGAC 

16401 AGCTTACCAG CTAAGAGTAC CTCCCAGGAA ACAGTCTGAC TAATGTGGAA 

16451 CCTGCAACTG TCAGTGTGGC TGGGGTCTTT TTAATTCCAG TGAGAAGCTC 

16501 TGGCTGAGAA GAAAATCACC ACTATTAAAA AAGCTGCTCC CCAAGCAGAT 

16551 TAGCTCTCTG TTAGGATTTT ACTAGTGGCC ATTCAGCAAG GACCTCTCTT 

16601 TACAGTGGCA CTTCATAGGC ACACTCTAAG GAGAAAGTGC AGAGTAGAAT 

16651 TCCTTCAGGG CATAAGCCAA AATGACTCTT TTTCTCAGGG ACCTGCATGG 

16701 GCCTCCAGCT TGTCTATTGG AATTGTTAAG TGAAGCCTCT CACTTAGTGC 

16751 CTCATTAGCA GAGATTTCCT CCAACCCAGC TTTTCTGTGC TCTTGGTATT 

16801 TTACTACTTG ATGTGGACCT CAGAGAAGCT GAACTGTAAT TGAAAATGTT 

16851 TCCGATGTGT GGAAGAAATG AAGACTGCTT TGTGTCTGCT GTTGTCCTGA 

16901 GTATTTCATT AATGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTATGTG 

16951 TATGTGTGTA GGGAAGAAAG TAATAATGGC TGAGACATCA CCTTCATGTT 

17001 GTTTGCGATT GGGATGGGTG ACTAACACTC CAAGGTAGAG TGAAGGCAGA 

17051 GGAGGGAAAC AAGATCACAT TAAATCATCA TCAGTACTGG TTTCTGCCTA 

17101 CAGGAGTTTA CTTTTTTTTT TTTTCCTTTT TTGAGATGGA GTCTCGCTCT 

17151 GTTTTCTAGG CTGAAGTGCA GTGGTGTGAT CTTGGCTCAC TGCAGCCTCT 

17201 GCCTCCTGGG TTCAAGCAGN NNNNNNNNNN NNNNNNNNNN NNNNAGTGAT 

17251 CCACCCGCCT CGGTCTCCCA AAGCACTGGG ATTACAGGCA TGAGCCACCT 

17301 CACGCGGCCA GGATTTTACT TTATAACAAG GAACATATGT TTATCAACCC 

17351 TCTGTTCGTT CCTATACCCC CAGTGGACGA ATGCATGTCT CCTTTTCTCC 

17401 TATATCTCAA TGTTTACATC TCATATCAGT TGGGTATTTT GATAGGAATG 

17451 TCAGCCAGCT ACCTCTGAGG TAACCAAGGG ATTGAAGTTA CTATGGCCAC 

17501 TGCCTATTGG GACCAAATAT CCCAGCATTT ACCTAACTAA TGCTTGCCCC 

17551 TCACAGACCA GGAAAATTAA AAGAACTCCT AGTCGTGGCC ACCACAACAC 

17601 TTCAAGAAAT TGTGAACAAT CTGACCTAGG GCTTCCTGTC CTCATCCAAT 

17651 TTTACTCTTG GTAGCATGCT AAGAATTTAT CTTTAGTCAT TTCCTCTCCT 

17701 CTTATCCAAT GTCAGGACAT TATGTTGAGG GAGTTCTCTC TTCTAAGTAG 

17751 CAGGGCTGTT AACCAAAGTA TCTTATTTCT TGGCATGGCT AGCATGGTTT 

17801 TCCCTTCATC AGCCACTGTT TGGGACTAAA AGGATTATAT ACTTAATTTG 

17851 GGAGAGACTG TATGGACTTG CTTTGGAACA GTGGAGAGCT CCTTTCTTCA 

17901 ACCCCAACTC CCCCATTCCA TTTTTCATGA TGAAGAGACT TAGTTATTGT 

17951 CATATAAAGC TCACCTGCTG TCTTCTAACT ATGTTATTCA AGG 
Gene Structure 
Start: 2001 
Start: 1962 
Exon: 15248-15993 
Stop: 15991 



CHROMOSOME MAP POSITION: 

Chromosome 23 

ALLELIC VARIANTS (SNPs) : 

DNA 
Position   Major   Minor   Domain 
2584       G       C       Intron 
2655       A       T       Intron 
3693       G       A       Intron 
3992       G       C       Intron 
6285       -       A       Intron 
7066       A       T       Intron 
14223      -       T G     Intron 
16915      -       G T     Beyond ORF(3') 



Context : 



DNA 

Position 

2584 ATAAACCGCTATCATTATTTATGCATCTAATCCTCTTGGGACCTGTTATCAGGTGAGCAA 
CTTTTAAATCTTTTCCTTACCCCCCTAACCCCACCCCAGACTTGGGCAGAGAAAGATGAA 
AGATTTACAAGATGGATACTATGGCTCTAATCAATTCTCTCATTTCCTCCCACTCTCGGC 
TTCCCTGTCTACCATTCAGAAAACTTACCTGAAATCTTAAATGCCACCATGATGAACATG 
TGGTATGTACTTGTGTTCCAAAACAATGAACGATGCTATTTGGGCTGTGTAAACTAGAAT 
[G,C] 

GGAACAACAAGACGTGATCACCCTGTGCATGAAGGCCATAGCTGCAGAGTGTGTAATTTT 
ATTTAAAAAAATTTTTTTTTCTGAGACAAGGTCTTGCTCTGCCTCCCAGGCTACAGTGCA 
GTGGTGCGATCATGGCTCACTGCAGCCTTGATCTCCTGGGATCAAGCGAACCTCCCACCT 
CAGCCTCCAAGTAGCTGGGACCAAAGGAATGTGTCACCATGCCTGGTTAATTAAAAAAAA 
ATTTTTATAGGCCGGGTGTGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGAG 

2655 TTTCCTTACCCCCCTAACCCCACCCCAGACTTGGGCAGAGAAAGATGAAAGATTTACAAG 
ATGGATACTATGGCTCTAATCAATTCTCTCATTTCCTCCCACTCTCGGCTTCCCTGTCTA 
CCATTCAGAAAACTTACCTGAAATCTTAAATGCCACCATGATGAACATGTGGTATGTACT 
TGTGTTCCAAAACAATGAACGATGCTATTTGGGCTGTGTAAACTAGAATGGGAACAACAA 
GACGTGATCACCCTGTGCATGAAGGCCATAGCTGCAGAGTGTGTAATTTTATTTAAAAAA 
[A,T] 

TTTTTTTTTCTGAGACAAGGTCTTGCTCTGCCTCCCAGGCTACAGTGCAGTGGTGCGATC 
ATGGCTCACTGCAGCCTTGATCTCCTGGGATCAAGCGAACCTCCCACCTCAGCCTCCAAG 
TAGCTGGGACCAAAGGAATGTGTCACCATGCCTGGTTAATTAAAAAAAAATTTTTATAGG 
CCGGGTGTGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGATC 
ACCTGAGGTCAGGAGTTCAAGACCAGCTGGCCAACATGGTGAAACCCCTGTCTCTACTAA 

3693 TCTCTGTCGGTTTATCAGTTTCCTATTTATCTCTTTGTATATTTCTGCAATAAAGATACG 
AAGTTGGGAGGGGGCAAAGGAAGGCAGTTCATCTCTCTATGTGGATGCAGTAGCACAATT 
TAATAGTATCAAGTATTTCCATTCAGATTGCCTTGAAGTGGAAAGAATGCACTTAATCCT 
AGCGAGATAGGCACCTGTGTCAACAGTCTCATCTGGATGCTATGGGGTTTTCAAGGTAGA 
GAGATGTTGCAAAACTTATGAGTTCAGGAGTAAGGAATGGACCAAGTTTGTCTTGATTGC 
[G,A] 

AGAGAGGCAGACAACTGCAGTCAGCCGAGGAATATGGGTCAGAGTGTTGCAATGGGAAGA 
TACCTCATCATTAGACAACTAAAAAGTCTGTGAAACTAATTAAGGATGGAACTCACTCCT 
TTATAAAATTTCATATCTGTACACATGTATAATTTTTATTTGTCACTTATACCTCAATAA 
GGCCAAAAAAATTTTTTATCAATAAATTTTTAAGTGGGGAGGAATCGATTAGGCTCTATC 
AGAGAGAATATGGGATATCAATGGAAACAGTGGCCTGAAATTTGGAGTCTAGTCTTCCGC 

3992 CGAGAGAGGCAGACAACTGCAGTCAGCCGAGGAATATGGGTCAGAGTGTTGCAATGGGAA 
GATACCTCATCATTAGACAACTAAAAAGTCTGTGAAACTAATTAAGGATGGAACTCACTC 
CTTTATAAAATTTCATATCTGTACACATGTATAATTTTTATTTGTCACTTATACCTCAAT 
AAGGCCAAAAAAATTTTTTATCAATAAATTTTTAAGTGGGGAGGAATCGATTAGGCTCTA 
TCAGAGAGAATATGGGATATCAATGGAAACAGTGGCCTGAAATTTGGAGTCTAGTCTTCC 
[G,C] 

CCTGTCATTGACTGGTTGTGTGTTCTTGGTAAAATCTCTGAAGATGGCTTCACAGGAAGG 
CATATAGAGTTCCCTCATCTGTAAAGCAAATGGGTTAGTCTAAATCATGGGTCTCAAACT 
CAAACACTTGCAGGGACCAGGCAGGTATCATAAATGAATGAAGCAGGCCTAGTATAAGAA 
AAAACAGTAGCCTTGTGTGAGATGATAAATGGAAACAAAGTCTCAGAGAAATACTGAGGA 
GTAGTGAGTACCATGGTAATCTGAAATCTTCATGACCTGCCTGAAGGAGGTAGCCCCTCT 

6285 TCTTTCGAGGGCCTAGTTATGTAGTTCATTCAGGTTTGAGTTGTCGTCTTTTAAGTACTT 
TTGTTGCTTTGATGGCTTCCTGTGTATATGAGATATTTTTTTTCCTCTGATCTGTCCCAA 
GACTTTTTGGCTGAGATATGGTTGTGAGCCCTTTCTTGAAAAAGCAGAATCTGGCCAGGC 
GCAGTGGCTCATGCCTGTAATCTCAGCACTTTGGGAAGCTGAGGTGGGTGGATCACCTGA 
GGTCAGGAGTTCAAGACCAGCCTGGCCAACATGGTGAAAACCCGTCTCTACTAAAAATAC 
[-,A] 

AAAAAAAAAAAAACCTTAGCCGGACATGGTGGCACATGCCTGTAATCCCAGCTACTCAGG 
AGGCTGAGGCAGGAGAATCGCTTGAACCCAGGAGGCAGAGGTTACAGTGAGCTGAGATCG 
CGCCAGTGCACTCCAGCCTGGGCGACAGAGCAAGACTCTGTCTCAAAAAAAAAAAAAAAA 
AGAAAGAAAGAAAAAGAAAAAGCAGAATCTAAAACTTTGGTTATGGAGCTGAATGCTTTG 
AGGGAGGAATGCTTTACCTCACGAATTTGAGGTAAGAAAACAGGGCCTTTGGAACCTTCA 

7066 TTGTAGAAGGCATGATCCACCCTTTGACTTATGAGAAATGATCAGAACAGAAGAGAGAAA 
AAGACAAAAAGTAGTGCAGGCTGGCCATGGTGTCTCACACGTGTGATCCCAGCACTTTAG 
GATCCCAGCACTTTGGGTCAAGGCAGTAGGATTGCTTGAGCCCAGGAGTTTGAGACCAGT 
CTGGGCAACATGTCTAGATCTCCTCTCTACACAAATTAAAAATAGCTGGCATGGTGGCAT 
GCGCCTGTAGTCCTAGCTACTCAGAAGGCTGAGGTGGGAGGATCATTTGAGCCTAGGAGG 
[A,T] 

CAAAGCTGCAATGAATTATGATTGTGCCACTGCACTCCAGCCAGGGTGATGGAGTAAGAC 
CTTGTCTCAAAAATAAAATAAAGTAGCACAACCTCCCCAAGTTATTTTTTTCCCTCACTA 
CAACCTCCCTTCCCAGGACAGCTTAGTTAAGTTTGCATGATGCTTTACTTCTGCAGATGT 
TTGGAGGCCATGATTAAGTACCTCACACTGTGGAAGAAAGAGGAGCAGGAGGAGCCCTAT 
GTCAGCCTCACCCGAAAGAAGATGCTAATAGATGGCGAGGAGGTGCTGATAGAATGGGAG 

14223 AAGGAAGTTTATTTTGTATATTTATATGATTATTAAAGTGTTACAGTATATGTTCATCAT 
GAGAAATTTAGAAAATAGAGAAATGTAGAGAAAAAGATTTCTAAAACTGATATAAGACTA 
TCACACACAAAAAAAGATATTTTGGTTCATTTTTTCAATTTTTTGTGCATCTATTTTGTT 
TTATTGTATATATTCAAGGTGTACAATGTGATGTTTCGATGTATGTACACATTGTGAAAT 
GATTACCACAACCAAACTAATTAACACATTCATCACCTCACATAGTTATCATTTTTGTAC 
[-,T,G] 

TGTGTGTGTGTGTGTGTGTGTGTGTGTGGTAAAACTTAAGATCTACTCTCTTTAAAAATT 
TCAAGTACACAATACATTATTGTCAACTATAGTCATCATGTTGTACATTAGAGCTCTGAA 
ACTTATTTATCTTATAACTCTAAATTTGTAGCCTTTGATCAAAATCCTTCTATTTCCCTA 
AATCCCCATCCCCTGGTAACCACCCATTCTACTCTGTTGCTAGGTGTTCAACTTTTTTAG 
ATTCCACATATAAGTAAGACAATGCAGTATTTTTCTTTATGTGTCTAGCTCATTTCACTT 

16915 ATAGGCACACTCTAAGGAGAAAGTGCAGAGTAGAATTCCTTCAGGGCATAAGCCAAAATG 
ACTCTTTTTCTCAGGGACCTGCATGGGCCTCCAGCTTGTCTATTGGAATTGTTAAGTGAA 
GCCTCTCACTTAGTGCCTCATTAGCAGAGATTTCCTCCAACCCAGCTTTTCTGTGCTCTT 
GGTATTTTACTACTTGATGTGGACCTCAGAGAAGCTGAACTGTAATTGAAAATGTTTCCG 
ATGTGTGGAAGAAATGAAGACTGCTTTGTGTCTGCTGTTGTCCTGAGTATTTCATTAATG 
[-,G,T] 

GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTGTATGTGTGTAGGGAAGAAAGTAATA 
ATGGCTGAGACATCACCTTCATGTTGTTTGCGATTGGGATGGGTGACTAACACTCCAAGG 
TAGAGTGAAGGCAGAGGAGGGAAACAAGATCACATTAAATCATCATCAGTACTGGTTTCT 
GCCTACAGGAGTTTACTTTTTTTTTTTTTCCTTTTTTGAGATGGAGTCTCGCTCTGTTTT 
CTAGGCTGAAGTGCAGTGGTGTGATCTTGGCTCACTGCAGCCTCTGCCTCCTGGGTTCAA 
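
Each context record above has the layout "5' flank, bracketed alleles, 3' flank", with "-" denoting a deletion allele. The following sketch shows one way to split such a record and rebuild the sequence carrying each allele; the function names are hypothetical, and the short flanks in the example are truncated from the 2584 and 6285 records for brevity.

```python
import re

def parse_snp_context(text):
    """Split a record into (left_flank, alleles, right_flank).

    Whitespace and line breaks inside the flanks are removed; alleles are
    returned in the order listed inside the square brackets.
    """
    m = re.search(r"\[([^\]]+)\]", text)
    if m is None:
        raise ValueError("no [allele,allele] marker found")
    left = re.sub(r"\s+", "", text[:m.start()])
    right = re.sub(r"\s+", "", text[m.end():])
    alleles = [a.strip() for a in m.group(1).split(",")]
    return left, alleles, right

def allele_sequences(left, alleles, right):
    """Build the full sequence for each allele ('-' contributes no base)."""
    return {a: left + ("" if a == "-" else a) + right for a in alleles}

left, alleles, right = parse_snp_context("CTAGAAT\n[G,C]\n\nGGAACA")
print(allele_sequences(left, alleles, right))

# An insertion/deletion record, as at position 6285.
print(allele_sequences(*parse_snp_context("ATAC [-,A] AAAA")))
```

The position of a variant can then be cross-checked by locating the left flank in the genomic sequence: the variant base follows immediately after the end of that flank.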